Faster Algorithms for Privately Releasing Marginals

Authors

  • Justin Thaler
  • Jonathan Ullman
  • Salil P. Vadhan
Abstract

We study the problem of releasing k-way marginals of a database $D \in (\{0,1\}^d)^n$, while preserving differential privacy. The answer to a k-way marginal query is the fraction of D's records $x \in \{0,1\}^d$ with a given value in each of a given set of up to k columns. Marginal queries enable a rich class of statistical analyses of a dataset, and designing efficient algorithms for privately releasing marginal queries has been identified as an important open problem in private data analysis (cf. Barak et al., PODS '07). We give an algorithm that runs in time $d^{O(\sqrt{k})}$ and releases a private summary capable of answering any k-way marginal query with at most ±0.01 error on every query as long as $n \ge d^{O(\sqrt{k})}$. To our knowledge, ours is the first algorithm capable of privately releasing marginal queries with non-trivial worst-case accuracy guarantees in time substantially smaller than the number of k-way marginal queries, which is $d^{\Theta(k)}$ (for $k \ll d$).
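As a concrete reading of the query definition in the abstract, the following minimal sketch evaluates a marginal query exactly on a toy database. It is not the paper's algorithm and provides no differential privacy; the function name `marginal`, the dictionary encoding of a query, and the toy data are assumptions made purely for illustration.

```python
from typing import Dict, Sequence


def marginal(db: Sequence[Sequence[int]], query: Dict[int, int]) -> float:
    """Exact (non-private) answer to a marginal query.

    db:    n records, each a sequence of d bits.
    query: maps a column index to the bit value required in that column
           (a k-way marginal uses at most k columns).
    """
    if not db:
        raise ValueError("database must contain at least one record")
    # Count records that agree with the query on every listed column.
    matches = sum(
        all(record[col] == val for col, val in query.items())
        for record in db
    )
    return matches / len(db)


# Hypothetical toy database with n = 4 records and d = 3 columns.
D = [
    (1, 0, 1),
    (1, 1, 1),
    (0, 0, 1),
    (1, 0, 0),
]

# 2-way marginal: fraction of records with column 0 == 1 and column 1 == 0.
answer = marginal(D, {0: 1, 1: 0})
```

Here two of the four records match, so `answer` is 0.5. A private release would have to answer such queries approximately from a summary rather than from D itself, which is the problem the abstract addresses.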

Similar resources

Faster Algorithms for Privately Releasing Marginals (25 Jun 2012)

We study the problem of releasing k-way marginals of a database $D \in (\{0,1\}^d)^n$, while preserving differential privacy. The answer to a k-way marginal query is the fraction of D's records $x \in \{0,1\}^d$ with a given value in each of a given set of up to k columns. Marginal queries enable a rich class of statistical analyses of a dataset, and designing efficient algorithms for privately releasing ma...

Efficient Algorithms for Privately Releasing Marginals via Convex Relaxations

Consider a database of n people, each represented by a bit-string of length d corresponding to the setting of d binary attributes. A k-way marginal query is specified by a subset S of k attributes, and a |S|-dimensional binary vector β specifying their values. The result for this query is a count of the number of people in the database whose attribute vector restricted to S agrees with β. Priva...

Marginal Release Under Local Differential Privacy

Many analysis and machine learning tasks require the availability of marginal statistics on multidimensional datasets while providing strong privacy guarantees for the data subjects. Applications for these statistics range from finding correlations in the data to fitting sophisticated prediction models. In this paper, we provide a set of algorithms for materializing marginal statistics under th...

Faster Private Release of Marginals on Small Databases (13 Apr 2013)

We study the problem of answering k-way marginal queries on a database $D \in (\{0,1\}^d)^n$, while preserving differential privacy. The answer to a k-way marginal query is the fraction of the database's records $x \in \{0,1\}^d$ with a given value in each of a given set of up to k columns. Marginal queries enable a rich class of statistical analyses on a dataset, and designing efficient algorithms for priv...

Efficient graphical models for sequence segmentation

Segmentation of sequences is an important modeling primitive with several applications. Training and inference of segmentation models involves dynamic programming computations that in the worst case can be cubic in the length of a sequence. In contrast, typical sequence labeling models require linear time. We propose an alternative graphical model for efficient sharing of potentials across over...

Publication date: 2012